Embedded Languages for Data-Parallel Programming

Author

  • Bo Joel Svensson
Abstract

Computers today are becoming more and more parallel. General-purpose processors (CPUs) have multiple processing cores and Single Instruction Multiple Data (SIMD) units for data parallelism. Graphics processors (GPUs) bring massive parallelism at the cost of being harder to program than CPUs. This thesis applies embedded language methodology to data-parallel programming. Two embedded languages are presented: Obsidian, for general-purpose GPU programming, and EmbArBB, for data-parallel programming across platforms.

CPUs and GPUs gain more parallel resources with each new generation, which raises the question of how to program these processors efficiently. We are after efficiency both in programmer productivity and in application performance. Using embedded languages allows us to experiment, at relatively little effort, with which abstractions to present to the programmer.

Obsidian is an embedded language for general-purpose programming of GPUs. We try to strike a balance between high-level, productivity-increasing abstractions and the low-level control needed for performance. The Obsidian programming model mirrors the GPU architecture, and the programmer is constrained to writing GPU-friendly code. Library functions that are polymorphic in the hierarchy level are supplied to make these constraints feel less obtrusive. Obsidian programs are compiled into CUDA C code. This compilation is based on a simple and elegant monad reification technique.

In cases where the programmer is not interested in low-level details, or wants the program to run over a range of hardware, a higher-level language can be used. EmbArBB is a Haskell embedding of the Intel ArBB system. EmbArBB relies on the ArBB system to generate code (via a just-in-time compiler) for a range of hardware. EmbArBB embeds a preexisting library for data parallelism into Haskell, and we obtain very good performance at little implementation effort. This performance comes from the expertise and effort put into the ArBB system, which we get for free. Embedding ArBB is a way to provide these benefits to the Haskell programmer and a way to increase the usefulness of an existing system by opening it up to a wider audience.

Obsidian is very different; it is not based on a set of high-level parallel primitives. Instead, the Obsidian programmer can implement these primitives in different ways and then select the best one. We have obtained very good performance in case studies involving reductions. Obsidian programs are also terser and more composable than the equivalent CUDA code.
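
To make the phrase "monad reification" more concrete, the following is a minimal sketch of the general idea and not Obsidian's actual implementation. It assumes a toy embedded language whose monadic bind is stored as data, so that a Haskell `do` block can be walked and printed as CUDA C. Every name here (`Exp`, `Program`, `reify`, `local`, `kernel`, `squarePlusOne`) is hypothetical and chosen only for illustration.

```haskell
{-# LANGUAGE GADTs #-}
module Main where

-- Hypothetical miniature EDSL; none of these names come from Obsidian.

-- Expressions of the embedded language (a deep embedding).
data Exp
  = Lit Int
  | Var String
  | Add Exp Exp
  | Mul Exp Exp
  | Index String Exp            -- array read: arr[i]

-- Programs form a monad whose bind is kept as a constructor,
-- so the structure of a do-block can be inspected afterwards.
data Program a where
  Fresh  :: Program String                      -- request a fresh variable name
  Decl   :: String -> Exp -> Program ()         -- int v = e;
  Assign :: String -> Exp -> Exp -> Program ()  -- arr[i] = e;
  Return :: a -> Program a
  Bind   :: Program a -> (a -> Program b) -> Program b

instance Functor Program where
  fmap f p = Bind p (Return . f)

instance Applicative Program where
  pure = Return
  pf <*> pa = Bind pf (\f -> Bind pa (Return . f))

instance Monad Program where
  (>>=) = Bind

-- Print an expression as C source text.
showExp :: Exp -> String
showExp (Lit n)     = show n
showExp (Var v)     = v
showExp (Add a b)   = "(" ++ showExp a ++ " + " ++ showExp b ++ ")"
showExp (Mul a b)   = "(" ++ showExp a ++ " * " ++ showExp b ++ ")"
showExp (Index a i) = a ++ "[" ++ showExp i ++ "]"

-- Reification: interpret the program with a name supply and
-- collect the C statements it denotes, in order.
reify :: Int -> Program a -> (a, Int, [String])
reify n (Return a)     = (a, n, [])
reify n Fresh          = ("v" ++ show n, n + 1, [])
reify n (Decl v e)     = ((), n, ["int " ++ v ++ " = " ++ showExp e ++ ";"])
reify n (Assign a i e) = ((), n, [a ++ "[" ++ showExp i ++ "] = " ++ showExp e ++ ";"])
reify n (Bind p k)     =
  let (a, n',  s1) = reify n p
      (b, n'', s2) = reify n' (k a)
  in  (b, n'', s1 ++ s2)

-- Bind an expression to a fresh local variable and return a reference to it.
local :: Exp -> Program Exp
local e = do
  v <- Fresh
  Decl v e
  return (Var v)

-- Example kernel body: output[tid] = input[tid] * input[tid] + 1
squarePlusOne :: Program ()
squarePlusOne = do
  x <- local (Index "input" (Var "tid"))
  y <- local (Mul x x)
  Assign "output" (Var "tid") (Add y (Lit 1))

-- Wrap the reified statements in a CUDA kernel signature.
kernel :: String -> Program () -> String
kernel name body =
  let (_, _, stmts) = reify 0 body
  in  unlines $
        [ "__global__ void " ++ name ++ "(const int *input, int *output) {"
        , "  int tid = blockIdx.x * blockDim.x + threadIdx.x;" ]
        ++ map ("  " ++) stmts ++
        [ "}" ]

main :: IO ()
main = putStr (kernel "squarePlusOne" squarePlusOne)
```

Running `main` prints the text of a small CUDA kernel. The point of the design is that the programmer writes ordinary monadic Haskell, while the embedding keeps enough structure to emit target code. A real system such as Obsidian would additionally need to model the GPU hierarchy, shared memory, and synchronisation, none of which this sketch attempts.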

Similar resources

Productive High Performance Parallel Programming with Auto-tuned Domain-Specific Embedded Languages

Agda Meets Accelerate

Embedded languages in Haskell benefit from a range of type extensions, such as type families, that are subsumed by dependent types. However, even with those type extensions, embedded languages for data parallel programming lack desirable static guarantees, such as static bounds checks in indexing and collective permutation operations. This observation raises the question whether an embedded lan...

General-purpose Graphics Processing Units Deliver New Capabilities to the Embedded Market

Today’s graphics processors are highly programmable, massively parallel compute engines. With the development of open, industry-standard parallel programming languages such as OpenCL™ and the continued evolution of heterogeneous computing, general-purpose graphics processing units (GPGPUs) offer exciting new capabilities for the embedded market. This paper examines some of the industry facto...

Programming and compiling for embedded SIMD architectures

Declaration This dissertation is the result of my own work and includes nothing which is the outcome of work done in collaboration except where specifically indicated in the text. I confirm that this dissertation, including tables and footnotes, but excluding appendices, bibliography and diagrams, does not exceed the regulation length of 60 000 words. Summary This dissertation studies programmi...

Using transition traces to model a security protocol

Security protocols are often difficult to specify formally and hard to prove correct, because of the potentially complex patterns of interaction between processes executing in parallel. Many people have proposed the use of formal methods in such applications (cf. [13, 14, 15, 16, 18, 23, 24]). For example, Roscoe and his colleagues have used the model checker FDR [10], based on the failures-div...

Publication date: 2013